Classical Planning in MDP Heuristics: with a Little Help from Generalization

Authors

  • Andrey Kolobov
  • Mausam
  • Daniel S. Weld
Abstract

Heuristic functions make MDP solvers practical by reducing their time and memory requirements. Some of the most effective heuristics (e.g., the FF heuristic function) first determinize the MDP and then solve a relaxation of the resulting classical planning problem (e.g., by ignoring delete effects). While these heuristic functions are fast to compute, they frequently yield overly optimistic value estimates. It is natural to wonder, then, whether the improved estimates obtained by running a full classical planner on the (non-relaxed) determinized domain provide enough gains to compensate for the vastly increased cost of computation. This paper shows that the answer is “No and Yes”. If one uses a full classical planner in the obvious way, the cost of computing the heuristic function outweighs the benefits. However, we show that one can make the idea practical by generalizing the results of classical planning successes and failures. Specifically, we introduce a novel heuristic function called GOTH that amortizes the cost of classical planning by 1) extracting basis functions from the plans discovered during heuristic computation, 2) using these basis functions to generalize the heuristic value of one state to cover many others, and 3) thus invoking the classical planner many fewer times than there are states. Experiments show that GOTH can provide vast time and memory savings compared to the FF heuristic function, especially on large problems.
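The three steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how basis functions are represented or weighted, so the sketch assumes that a basis function is a conjunction of literals obtained by regressing the goal through a discovered plan, that its weight is the cost of the remaining plan suffix, and that a state's heuristic value is the minimum weight over the basis functions holding in it. The breadth-first planner, the toy domain, and all names (`Action`, `bfs_plan`, `GeneralizingHeuristic`, `ACTIONS`, `GOAL`) are hypothetical stand-ins, and the MDP is assumed to be determinized already.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """Deterministic STRIPS-style action (assumed output of determinization)."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset
    cost: float = 1.0


def bfs_plan(state, goal, actions):
    """Stand-in for a full classical planner: plain breadth-first search."""
    start = frozenset(state)
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        s, plan = frontier.popleft()
        if goal <= s:
            return plan
        for a in actions:
            if a.pre <= s:
                t = (s - a.delete) | a.add
                if t not in seen:
                    seen.add(t)
                    frontier.append((t, plan + [a]))
    return None  # no plan exists from this state


class GeneralizingHeuristic:
    """Caches basis functions so the planner runs far less often than once per state."""

    def __init__(self, goal, actions):
        self.goal = frozenset(goal)
        self.actions = actions
        self.basis = {self.goal: 0.0}  # conjunction of literals -> weight
        self.dead_ends = set()         # states the planner failed on
        self.planner_calls = 0

    def _learn(self, plan):
        # Step 1: regress the goal through the plan; each intermediate
        # condition is a basis function weighted by the plan-suffix cost.
        cond, cost = self.goal, 0.0
        for a in reversed(plan):
            cond = (cond - a.add) | a.pre
            cost += a.cost
            self.basis[cond] = min(cost, self.basis.get(cond, float("inf")))

    def _lookup(self, state):
        # Step 2: every basis function that holds in the state applies to it.
        weights = [w for cond, w in self.basis.items() if cond <= state]
        return min(weights) if weights else None

    def __call__(self, state):
        state = frozenset(state)
        value = self._lookup(state)
        if value is not None:
            return value
        if state in self.dead_ends:
            return float("inf")
        # Step 3: call the planner only when nothing known covers the state.
        self.planner_calls += 1
        plan = bfs_plan(state, self.goal, self.actions)
        if plan is None:
            self.dead_ends.add(state)
            return float("inf")
        self._learn(plan)
        return self._lookup(state)


def act(name, pre, add, delete=()):
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete))


# Hypothetical toy domain: fetch a key at location a, then walk a -> b -> c.
ACTIONS = [
    act("pick_key", {"at_a"}, {"have_key"}),
    act("move_a_b", {"at_a"}, {"at_b"}, {"at_a"}),
    act("move_b_c", {"at_b", "have_key"}, {"at_c"}, {"at_b"}),
]
GOAL = {"at_c"}

h = GeneralizingHeuristic(GOAL, ACTIONS)
print(h({"at_a"}), h({"at_a", "raining"}), h.planner_calls)
```

In this toy run the second query differs from the first only by an irrelevant literal, so it is answered from the cached basis functions without another planner call. The sketch records failures per state only; generalizing failures to other states, as the abstract says GOTH does, is not modeled here.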

Similar resources

Extending Classical Planning Heuristics to Probabilistic Planning with Dead-Ends

Recent domain-determinization techniques have been very successful in many probabilistic planning problems. We claim that traditional heuristic MDP algorithms have been unsuccessful due mostly to the lack of efficient heuristics in structured domains. Previous attempts like mGPT applied classical planning heuristics to an all-outcome determinization of MDPs without discount factor; yet, discounte...

Solving Non-deterministic Planning Problems with Pattern Database Heuristics

Non-determinism arises naturally in many real-world applications of action planning. Strong plans for problems of this type can be found using AO* search guided by an appropriate heuristic function. Most domain-independent heuristics considered in this context so far are based on the idea of ignoring delete lists and do not properly take the non-determinism into account. Therefore, we investiga...

Heuristics and Symmetries in Classical Planning

Heuristic search is a state-of-the-art approach to classical planning. Several heuristic families were developed over the years to automatically estimate goal distance information from problem descriptions. Orthogonally to the development of better heuristics, recent years have seen an increasing interest in symmetry-based state space pruning techniques that aim at reducing the search effort. H...

PAC optimal MDP planning with application to invasive species management

In a simulator-defined MDP, the Markovian dynamics and rewards are provided in the form of a simulator from which samples can be drawn. This paper studies MDP planning algorithms that attempt to minimize the number of simulator calls before terminating and outputting a policy that is approximately optimal with high probability. The paper introduces two heuristics for efficient exploration and a...

New Heuristics for Timeline-Based Planning

The timeline-based approach to planning represents an effective alternative to classical planning in complex domains where different types of reasoning are required in parallel. The iLoC domain-independent planning system takes inspiration from both Constraint Programming (CP) and Logic Programming (LP). By solving both planning and scheduling problems in a uniform schema, iLoC is particularly s...

Publication date: 2010